AITopics | small dataset

Collaborating Authors

small dataset

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

1b80fe066fdbceb3a2960117bac33917-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 12:57:51 GMT

artificial intelligence, gradient, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.72)

Add feedback

TabEBM: A Tabular Data Augmentation Method with Distinct Class-Specific Energy-Based Models

Neural Information Processing SystemsMar-21-2026, 09:46:10 GMT

Data collection is often difficult in critical fields such as medicine, physics, and chemistry, yielding typically only small tabular datasets. However, classification methods tend to struggle with these small datasets, leading to poor predictive performance. Increasing the training set with additional synthetic data, similar to data augmentation in images, is commonly believed to improve downstream tabular classification performance. However, current tabular generative methods that learn either the joint distribution $ p(\mathbf{x}, y) $ or the class-conditional distribution $ p(\mathbf{x} \mid y) $ often overfit on small datasets, resulting in poor-quality synthetic data, usually worsening classification performance compared to using real data alone. To solve these challenges, we introduce TabEBM, a novel class-conditional generative method using Energy-Based Models (EBMs). Unlike existing tabular methods that use a shared model to approximate all class-conditional densities, our key innovation is to create distinct EBM generative models for each class, each modelling its class-specific data distribution individually. This approach creates robust energy landscapes, even in ambiguous class distributions. Our experiments show that TabEBM generates synthetic data with higher quality and better statistical fidelity than existing methods. When used for data augmentation, our synthetic data consistently leads to improved classification performance across diverse datasets of various sizes, especially small ones.

artificial intelligence, machine learning, proceedings, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Dynamic Neural Regeneration: Enhancing Deep Learning Generalization on Small Datasets

Neural Information Processing SystemsMar-21-2026, 04:43:09 GMT

The efficacy of deep learning techniques is contingent upon access to large volumes of data (labeled or unlabeled). However, in practical domains such as medical applications, data availability is often limited. This presents a significant challenge: How can we effectively train deep neural networks on relatively small datasets while improving generalization? Recent works have explored evolutionary or iterative training paradigms, which reinitialize a subset of parameters to enhance generalization performance for small datasets. However, these methods typically rely on randomly selected parameter subsets and maintain fixed masks throughout training, potentially leading to suboptimal outcomes. Inspired by neurogenesis in the brain, we propose a novel iterative training framework, Dynamic Neural Regeneration (DNR), that employs a data-aware dynamic masking scheme to eliminate redundant connections by estimating their significance. This approach increases the model's capacity for further learning through random weight reinitialization. Experimental results demonstrate that our approach outperforms existing methods in accuracy and robustness, highlighting its potential for real-world applications where data collection is challenging.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

Neural Information Processing SystemsFeb-17-2026, 15:45:19 GMT

Diffusion models are powerful, but they require a lot of time and data to train. We propose Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training time costs while improving data efficiency, which thus helps democratize diffusion model training to broader users. At the core of our innovations is a new conditional score function at the patch level, where the patch location in the original image is included as additional coordinate channels, while the patch size is randomized and diversified throughout training to encode the cross-region dependency at multiple scales. Sampling with our method is as easy as in the original diffusion model.

artificial intelligence, diffusion model, machine learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Industry: Media (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

DynamicNeuralRegeneration: EnhancingDeep LearningGeneralizationonSmallDatasets

Neural Information Processing SystemsFeb-15-2026, 23:04:03 GMT

Recent works have explored evolutionary or iterativetraining paradigms, which reinitialize asubset ofparameters toenhance generalization performance forsmalldatasets.

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

47ed62021460f2e9bba7be3e74260090-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 17:16:10 GMT

dataset, interpolation, tl lattice, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.74)

Add feedback

Appendix of RECESS A Additional Related Works A.1 Federated Learning FedAvg. FedAvg [

Neural Information Processing SystemsFeb-8-2026, 13:15:25 GMT

The aggregation gradient is a weighted average of each client's upload gradient, and the weight is determined by the number of However, the aggregation gradient, i.e., the global model, is vulnerable to poisoning From the perspective of the attacker's goal, poisoning attacks are categorized as targeted and untar-geted attacks. Note that Mkrum is Krum when m = 1, and Mkrum is FedAvg when m = n . FL Trust involves the server with a small dataset to participate in each iteration and generate a gradient benchmark in each iteration. FL Trust would discard benign outliers. All clients just follow normal FL training without any extra rules to obey.

artificial intelligence, gradient, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

181a027913d36bc0a8857c0da661d621-Paper-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 09:06:50 GMT

dataset, representation, small dataset, (13 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Expanding Small-Scale Datasets with Guided Imagination

Neural Information Processing SystemsDec-27-2025, 04:39:24 GMT

The power of DNNs relies heavily on the quantity and quality of training data. However, collecting and annotating data on a large scale is often expensive and time-consuming. To address this issue, we explore a new task, termed dataset expansion, aimed at expanding a ready-to-use small dataset by automatically creating new labeled samples. To this end, we present a Guided Imagination Framework (GIF) that leverages cutting-edge generative models like DALL-E2 and Stable Diffusion (SD) to imagine and create informative new data from the input seed data. Specifically, GIF conducts data imagination by optimizing the latent features of the seed data in the semantically meaningful space of the prior model, resulting in the creation of photo-realistic images with new content. To guide the imagination towards creating informative samples for model training, we introduce two key criteria, i.e., class-maintained information boosting and sample diversity promotion. These criteria are verified to be essential for effective dataset expansion: GIF-SD obtains 13.5% higher model accuracy on natural image datasets than unguided expansion with SD. With these essential criteria, GIF successfully expands small datasets in various scenarios, boosting model accuracy by 36.9% on average over six natural image datasets and by 13.5% on average over three medical datasets.

expanding small-scale dataset, guided imagination, name change, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

Add feedback

Stochastic Normalization

Neural Information Processing SystemsDec-24-2025, 12:47:46 GMT

Fine-tuning pre-trained deep networks on a small dataset is an important component in the deep learning pipeline. A critical problem in fine-tuning is how to avoid over-fitting when data are limited. Existing efforts work from two aspects: (1) impose regularization on parameters or features; (2) transfer prior knowledge to fine-tuning by reusing pre-trained parameters. In this paper, we take an alternative approach by refactoring the widely used Batch Normalization (BN) module to mitigate over-fitting. We propose a two-branch design with one branch normalized by mini-batch statistics and the other branch normalized by moving statistics.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Add feedback